India’s Invisible Workforce: The Workers Who Don’t Know They’re Building Tomorrow’s Robots

Every time a housekeeper folds a bedsheet, a factory worker assembles a part, or a delivery agent sorts a package, that movement could be worth money to someone building a robot. Not to the worker. To the startup recording them.

This is the new reality of physical AI, and it is unfolding quietly across India’s workplaces, hotels, restaurants, and homes.

At the heart of this week’s controversy is Human Archive, a startup founded by four 20-year-old college dropouts. It has just raised $8.2 million to do one thing: record how humans move, and sell that data to the companies building robots.


The Gold Rush Nobody Told Workers About

Unlike language models, which learn from text scraped off the internet, or image models, which are trained on billions of public photos, robots cannot learn from the web. They need real-world data. They need to watch humans do things, thousands of hours of it, before they can replicate it.

That creates a problem for every frontier lab racing to build the next humanoid robot. And Human Archive spotted the opportunity.

The startup equips workers with camera rigs — downward-facing 4K cameras, depth sensors, tactile gloves, and wrist-mounted devices — that record exactly how human hands perform tasks. That footage is then anonymised, processed through proprietary AI models, and sold as training data to robotics companies and frontier labs.

The buyers? Companies backed by the biggest names in tech. Human Archive’s own investor list reads like a Silicon Valley directory — executives from OpenAI, NVIDIA, Google, Meta, and Anduril have all put money in.


Why India Is Ground Zero

India was not merely a convenient choice for Human Archive. It was a strategic one.

No other country on earth offers the same combination of industrial diversity and sheer workforce scale. Cofounder Rushil Agarwal put it plainly — India gives you access to industries ranging “from jewellery to textile to coal to steel to everything” within a single country, paired with a labour economy that simply does not exist in the United States.

The result: over 120 partnerships signed across hotels, restaurants, quick commerce platforms, construction sites, and factories — the majority of them in India. The startup has already accumulated tens of thousands of hours of movement data. Its ambition is millions.

India, in other words, is not just a market for Human Archive. It is the engine.


The Controversy That Blew the Lid Off

The startup might have remained under the radar for much longer had it not been for two home services apps.

Pronto was the first to be found running a pilot programme that recorded workers inside customers’ homes and used the tracking data to train AI models. The backlash was immediate. Days later, rival platform Snabbit was linked to a similar test, this one conducted in direct partnership with Human Archive.

Both companies scrambled to clarify. Pronto insisted cameras only enter homes when customers explicitly opt in before each booking. Snabbit’s founder Aayush Agarwal stated that the company had no intention of ever deploying the technology in customer homes. Human Archive maintained that the test was a controlled simulation, not a real-world deployment.

But the clarifications came too late to contain the conversation. India’s Ministry of Electronics and Information Technology (MeitY) is now reportedly taking cognisance of the situation, with potential regulatory scrutiny on the horizon for startups that use data recorded inside homes.


The Questions That Cannot Be Deflected

Do workers actually know what they’re contributing to?

Human Archive says yes — for workers it directly employs. But most data is collected through partner businesses, and the startup openly admits that informing workers is the partner’s responsibility. In India’s largely unorganised blue-collar sector, where workers earn between ₹15,000 and ₹35,000 a month and job security is fragile, “informed consent through a partner” is a standard that is very easy to fall short of.

The more uncomfortable question — whether workers understand they could be training the technology that eventually replaces them — has not been satisfactorily answered by anyone in the industry.

Is the money fair?

Human Archive claims workers earn between $10 and $100 per hour, a significant multiple of market wages. The number could not be independently verified. And even if it is accurate, the ethical calculus of paying someone today to build the system that displaces them tomorrow is not resolved by the hourly rate alone.

Who bears the legal risk?

Recording inside private homes sits uncomfortably close to the boundaries of India’s Digital Personal Data Protection Act (DPDPA) 2023. The law is clear on consent and purpose limitation. Whether the current practices of any of these companies fully comply is a question that regulators may soon be asking directly.


A Market Too Big to Stop

The controversy will not slow the industry. If anything, it signals how fast the physical AI data market is growing.

Scale AI, which is partially owned by Meta, already runs a dedicated Data Engine for Physical AI with over 100,000 production hours logged. Build AI, a newer entrant, sells first-person (egocentric) video datasets aimed at industrial robots. This week, entrepreneur Abhinav Kukreja launched Neocambrian AI with a pitch almost identical to Human Archive’s: building a large-scale database of human action from India for physical AI training.

The physical AI market is projected to grow at a CAGR of over 47% between 2026 and 2032, reaching $15.24 billion by 2032, according to MarketsandMarkets. Every robotics company competing for that market needs data. And the cheapest, most scalable source of that data is India’s workforce.
